Constituent Structure for Filipino: Induction through Probabilistic Approaches
Abstract
The current state of Philippine linguistic resources, which include formal grammars, electronic dictionaries and corpora, is not yet substantial enough to support industrial-strength language technologies. This paper discusses a computational approach to automatically estimating constituent structures from a corpus using unsupervised probabilistic approaches. Two models are presented, and results show an F1 measure greater than 69%. Issues and phenomena of the Filipino language are identified and discussed.
Similar resources
A Trust Based Probabilistic Method for Efficient Correctness Verification in Database Outsourcing
Correctness verification of query results is a significant challenge in database outsourcing. Most of the proposed approaches impose high overhead, which makes them impractical in real scenarios. Probabilistic approaches are proposed in order to reduce the computation overhead pertaining to the verification process. In this paper, we use the notion of trust as the basis of our probabilistic app...
Natural Language Grammar Induction Using a Constituent-Context Model
This paper presents a novel approach to the unsupervised learning of syntactic analyses of natural language text. Most previous work has focused on maximizing likelihood according to generative PCFG models. In contrast, we employ a simpler probabilistic model over trees based directly on constituent identity and linear context, and use an EM-like iterative procedure to induce structure. This me...
Learning Translation Rules for a Bidirectional English-Filipino Machine Translator
Filipino is a changing language that poses several challenges. Our goal is to develop a bidirectional English-Filipino Machine Translation (MT) system using a hybrid approach to learn rules from examples. The first phase was an English-to-Filipino MT system that required several language resources. The problem lies in its dependency on the annotated grammar, which is currently unavailable for ...
Alternative approaches to obtain t-norms and t-conorms on bounded lattices
Triangular norms arose in the study of probabilistic metric spaces as a special kind of associative function defined on the unit interval. These functions have found applications in many areas since then. In this study, we present new methods for constructing triangular norms and triangular conorms on an arbitrary bounded lattice under some constraints. Also, we give some illustrative examples for t...
A Greedy Approach to Unsupervised Grammar Induction for Filipino
Copyright 2008. This paper discusses the Greedy Merge Model used for an unsupervised grammar induction system for the Filipino language. The approach attempts to address the current state of Philippine linguistic resources, specifically the formal grammars, which are insubstantial for robust analysis. The Greedy Merge Model results show an F1 measure of 69%. Generated grammar rules are ...
Publication date: 2008